Fast Discovery of Frequent Itemsets: a Cubic Structure-Based Approach

Authors

  • Renata Iváncsy
  • István Vajk
Abstract

Mining frequent patterns in large transactional databases is a highly researched area in the field of data mining. Existing frequent pattern discovery algorithms suffer from various problems regarding computational cost, I/O cost, and memory requirements when mining large amounts of data. This paper introduces a novel approach to these issues. The contribution of the new method is to count short patterns very quickly using a specific index structure. The suggested algorithm is partially based on the Apriori hypothesis and exploits a new index table-based cubic structure to count the occurrences of the candidates. Experimental results show the advantageous execution time behavior of the proposed algorithm, especially when mining datasets that have a huge number of short patterns. Its memory requirement, which is independent of the number of processed transactions, is another benefit of the new method.
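
The abstract does not spell out the structure itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes that an index table maps each frequent item to a dense position, and that 2- and 3-itemsets are then counted directly in a square and a cubic array addressed by those positions. The function name `count_short_patterns` and its parameters are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_short_patterns(transactions, min_support):
    """Count frequent 1-, 2- and 3-itemsets with array-based counters.

    Illustrative sketch only: the index table and cube layout are
    assumptions, not the data structure described in the paper.
    """
    # Pass 1: count single items; only frequent items can appear in
    # longer frequent itemsets (the Apriori property).
    item_counts = Counter(item for t in transactions for item in set(t))
    frequent = sorted(i for i, c in item_counts.items() if c >= min_support)

    # Index table: frequent item -> dense position in the counters.
    index = {item: pos for pos, item in enumerate(frequent)}
    n = len(frequent)

    # Counter sizes depend on n only, not on the number of transactions.
    pair = [[0] * n for _ in range(n)]
    cube = [[[0] * n for _ in range(n)] for _ in range(n)]

    # Pass 2: direct array increments, no candidate lookup needed.
    for t in transactions:
        ids = sorted({index[i] for i in t if i in index})
        for a, b in combinations(ids, 2):
            pair[a][b] += 1
        for a, b, c in combinations(ids, 3):
            cube[a][b][c] += 1

    # Collect every itemset that meets the support threshold.
    result = {(item,): item_counts[item] for item in frequent}
    for a, b in combinations(range(n), 2):
        if pair[a][b] >= min_support:
            result[(frequent[a], frequent[b])] = pair[a][b]
    for a, b, c in combinations(range(n), 3):
        if cube[a][b][c] >= min_support:
            result[(frequent[a], frequent[b], frequent[c])] = cube[a][b][c]
    return result
```

In this sketch the counters grow with the cube of the number of frequent items, so memory use does not depend on how many transactions are processed, which matches the property the abstract claims. Longer patterns would need a separate mechanism that this sketch does not cover.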

Similar resources

An Efficient Method for Mining Frequent Itemsets in Market Basket Data Analysis

Discovery of hidden and valuable knowledge from large data warehouses is an important research area and has attracted the attention of many researchers in recent years. Most of Association Rule Mining (ARM) algorithms start by searching for frequent itemsets by scanning the whole database repeatedly and enumerating the occurrences of each candidate itemset. In data mining problems, the size of ...

A New Compact Structure to Extract Frequent Itemsets

Discovery of association rules is an important problem in the KDD process. In this paper we propose a new algorithm for fast frequent itemset mining, which scans the transaction database only once. All the frequent itemsets can be efficiently extracted in a single database pass. To achieve this objective, we define a new compact data structure, called ST-Tree (Signature Transaction Tree), and a new ...

LCM over ZBDDs: Fast Generation of Very Large-Scale Frequent Itemsets Using a Compact Graph-Based Representation

(Abstract) Frequent itemset mining is one of the fundamental techniques for data mining and knowledge discovery. In the last decade, a number of efficient algorithms for frequent itemset mining have been presented, but most of them focused on just enumerating the itemsets which satisfy the given conditions, and it was a different matter how to store and index the mining result for efficient dat...

Fast Algorithms for Mining Generalized Frequent Patterns of Generalized Association Rules

Mining generalized frequent patterns of generalized association rules is an important process in knowledge discovery systems. In this paper, we propose a new approach for efficiently mining all frequent patterns using a novel set enumeration algorithm with two types of constraints on two generalized itemset relationships, called subset-superset and ancestor-descendant constraints. We also show a...

Graph Based Approach for Finding Frequent Itemsets to Discover Association Rules

The discovery of association rules is an important task in data mining and knowledge discovery. Several algorithms have been developed for finding frequent itemsets and mining comprehensive association rules from databases. The efficiency of these algorithms has long been a major issue and has captured the interest of a large community of researchers. This paper presents a new approa...

Journal title:
  • Informatica (Slovenia)

Volume 29  Issue

Pages  -

Publication date 2005